AITopics

Technology: Information Technology > Artificial Intelligence (0.39)

Neural Information Processing Systems Dec-25-2025, 09:13:14 GMT

Towards Hard-pose Virtual Try-on via 3D-aware Global Correspondence Learning

In this paper, we target image-based person-to-person virtual try-on in the presence of diverse poses and large viewpoint variations. Existing methods are restricted in this setting as they estimate garment warping flows mainly based on 2D poses and appearance, which omits the geometric prior of the 3D human body shape. Moreover, current garment warping methods are confined to localized regions, which makes them ineffective in capturing long-range dependencies and results in inferior flows with artifacts. To tackle these issues, we present 3D-aware global correspondences, which are reliable flows that jointly encode global semantic correlations, local deformations, and geometric priors of 3D human bodies. Particularly, given an image pair depicting the source and target person, (a) we first obtain their pose-aware and high-level representations via two encoders, and introduce a coarse-to-fine decoder with multiple refinement modules to predict the pixel-wise global correspondence.

3d-aware global correspondence learning, hard-pose virtual try-on, name change

Technology: Information Technology > Artificial Intelligence (0.62)

Neural Information Processing Systems Dec-23-2025, 19:26:35 GMT

Towards Scalable Unpaired Virtual Try-On via Patch-Routed Spatially-Adaptive GAN

Image-based virtual try-on is one of the most promising applications of human-centric image generation due to its tremendous real-world potential. Yet, as most try-on approaches fit in-shop garments onto a target person, they require the laborious and restrictive construction of a paired training dataset, severely limiting their scalability. While a few recent works attempt to transfer garments directly from one person to another, alleviating the need to collect paired datasets, their performance is impacted by the lack of paired (supervised) information. In particular, disentangling style and spatial information of the garment becomes a challenge, which existing methods either address by requiring auxiliary data or extensive online optimization procedures, thereby still inhibiting their scalability. To achieve a scalable virtual try-on system that can transfer arbitrary garments between a source and a target person in an unsupervised manner, we thus propose a texture-preserving end-to-end network, the PAtch-routed SpaTially-Adaptive GAN (PASTA-GAN), that facilitates real-world unpaired virtual try-on. Specifically, to disentangle the style and spatial information of each garment, PASTA-GAN consists of an innovative patch-routed disentanglement module for successfully retaining garment texture and shape characteristics. Guided by the source person's keypoints, the patch-routed disentanglement module first decouples garments into normalized patches, thus eliminating the inherent spatial information of the garment, and then reconstructs the normalized patches to the warped garment complying with the target person pose. Given the warped garment, PASTA-GAN further introduces novel spatially-adaptive residual blocks that guide the generator to synthesize more realistic garment details. Extensive comparisons with paired and unpaired approaches demonstrate the superiority of PASTA-GAN, highlighting its ability to generate high-quality try-on images when faced with a large variety of garments (e.g.

garment, patch-routed spatially-adaptive gan, scalable unpaired virtual try-on

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.74)

arXiv.org Artificial Intelligence Sep-10-2025

Unlearning vs. Obfuscation: Are We Truly Removing Knowledge?

Sun, Guangzhi, Manakul, Potsawee, Zhan, Xiao, Gales, Mark

Unlearning has emerged as a critical capability for large language models (LLMs) to support data privacy, regulatory compliance, and ethical AI deployment. Recent techniques often rely on obfuscation by injecting incorrect or irrelevant information to suppress knowledge. Such methods effectively constitute knowledge addition rather than true removal, often leaving models vulnerable to probing. In this paper, we formally distinguish unlearning from obfuscation and introduce a probing-based evaluation framework to assess whether existing approaches genuinely remove targeted information. Moreover, we propose DF-MCQ, a novel unlearning method that flattens the model predictive distribution over automatically generated multiple-choice questions using KL-divergence, effectively removing knowledge about target individuals and triggering appropriate refusal behaviour. Experimental results demonstrate that DF-MCQ achieves unlearning with over 90% refusal rate and a random choice-level uncertainty that is much higher than obfuscation on probing questions.
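As a rough illustration of what "flattening the predictive distribution over multiple-choice questions using KL-divergence" could look like, the sketch below computes a KL term between a uniform distribution and a softmax over per-choice scores. It is not the authors' implementation: the function names, the direction of the KL term, and the use of scores restricted to the answer options are all assumptions made for this example.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_uniform_to_model(choice_logits):
    """KL(U || p), where p = softmax(choice_logits) and U is uniform.

    Hypothetical flattening objective: it is zero when the model is
    equally unsure about every answer option and grows as the model
    concentrates probability on one option. The KL direction is an
    assumption, not taken from the paper.
    """
    p = softmax(choice_logits)
    u = 1.0 / len(p)
    return sum(u * math.log(u / p_i) for p_i in p)

# A model that still "knows" the answer versus a fully flattened one.
confident = [4.0, 0.0, 0.0, 0.0]
flattened = [1.0, 1.0, 1.0, 1.0]
print(kl_uniform_to_model(confident), kl_uniform_to_model(flattened))
```

Under these assumptions, minimizing such a term pushes the model toward chance-level uncertainty on probing questions, which matches the abstract's description of the goal without injecting an alternative answer.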

large language model, machine learning, natural language

2505.02884

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.73)

arXiv.org Artificial Intelligence Jul-10-2025

TPT-Bench: A Large-Scale, Long-Term and Robot-Egocentric Dataset for Benchmarking Target Person Tracking

Ye, Hanjing, Zhan, Yu, Situ, Weixi, Chen, Guangcheng, Yu, Jingwen, Zhao, Ziqi, Cai, Kuanqi, Ajoudani, Arash, Zhang, Hong

Tracking a target person from robot-egocentric views is crucial for developing autonomous robots that provide continuous personalized assistance or collaboration in Human-Robot Interaction (HRI) and Embodied AI. However, most existing target person tracking (TPT) benchmarks are limited to controlled laboratory environments with few distractions, clean backgrounds, and short-term occlusions. In this paper, we introduce a large-scale dataset designed for TPT in crowded and unstructured environments, demonstrated through a robot-person following task. The dataset is collected by a human pushing a sensor-equipped cart while following a target person, capturing human-like following behavior and emphasizing long-term tracking challenges, including frequent occlusions and the need for re-identification from numerous pedestrians. It includes multi-modal data streams, including odometry, 3D LiDAR, IMU, panoramic images, and RGB-D images, along with exhaustively annotated 2D bounding boxes of the target person across 48 sequences, both indoors and outdoors. Using this dataset and visual annotations, we perform extensive experiments with existing SOTA TPT methods, offering a thorough analysis of their limitations and suggesting future research directions.

artificial intelligence, deep learning, machine learning

2505.07446

Country: Asia > China (0.28)

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine (0.68)
Retail (0.67)
Education (0.67)
Consumer Products & Services > Restaurants (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Video Understanding (0.71)

arXiv.org Artificial Intelligence Mar-3-2025

RPF-Search: Field-based Search for Robot Person Following in Unknown Dynamic Environments

Ye, Hanjing, Cai, Kuanqi, Zhan, Yu, Xia, Bingyi, Ajoudani, Arash, Zhang, Hong

Autonomous robot person-following (RPF) systems are crucial for personal assistance and security but suffer from target loss due to occlusions in dynamic, unknown environments. Current methods rely on pre-built maps and assume static environments, limiting their effectiveness in real-world settings. There is a critical gap in re-finding targets under topographic (e.g., walls, corners) and dynamic (e.g., moving pedestrians) occlusions. In this paper, we propose a novel heuristic-guided search framework that dynamically builds environmental maps while following the target and resolves various occlusions by prioritizing high-probability areas for locating the target. For topographic occlusions, a belief-guided search field is constructed and used to evaluate the likelihood of the target's presence, while for dynamic occlusions, a fluid-field approach allows the robot to adaptively follow or overtake moving occluders. Past motion cues and environmental observations refine the search decision over time. Our results demonstrate that the proposed method outperforms existing approaches in terms of search efficiency and success rates, both in simulations and real-world tests. Our target search method enhances the adaptability and reliability of RPF systems in unknown and dynamic environments to support their use in real-world applications. Our code, video, experimental results and appendix are available at https://medlartea.github.io/rpf-search/.

occlusion, robot, target person

2503.02188

Country:

North America > Canada > Alberta (0.14)
Europe > Switzerland > Zürich > Zürich (0.04)
Europe > Italy (0.04)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Neural Information Processing Systems Jan-18-2025, 23:55:54 GMT

Towards Hard-pose Virtual Try-on via 3D-aware Global Correspondence Learning

In this paper, we target image-based person-to-person virtual try-on in the presence of diverse poses and large viewpoint variations. Existing methods are restricted in this setting as they estimate garment warping flows mainly based on 2D poses and appearance, which omits the geometric prior of the 3D human body shape. Moreover, current garment warping methods are confined to localized regions, which makes them ineffective in capturing long-range dependencies and results in inferior flows with artifacts. To tackle these issues, we present 3D-aware global correspondences, which are reliable flows that jointly encode global semantic correlations, local deformations, and geometric priors of 3D human bodies. Particularly, given an image pair depicting the source and target person, (a) we first obtain their pose-aware and high-level representations via two encoders, and introduce a coarse-to-fine decoder with multiple refinement modules to predict the pixel-wise global correspondence. Extensive experiments on public benchmarks and our selected HardPose test set demonstrate the superiority of our method against state-of-the-art try-on approaches.

3d-aware global correspondence learning, hard-pose virtual try-on, target person

Technology: Information Technology > Artificial Intelligence (0.65)

Neural Information Processing Systems Oct-9-2024, 14:51:04 GMT

Towards Scalable Unpaired Virtual Try-On via Patch-Routed Spatially-Adaptive GAN

Image-based virtual try-on is one of the most promising applications of human-centric image generation due to its tremendous real-world potential. Yet, as most try-on approaches fit in-shop garments onto a target person, they require the laborious and restrictive construction of a paired training dataset, severely limiting their scalability. While a few recent works attempt to transfer garments directly from one person to another, alleviating the need to collect paired datasets, their performance is impacted by the lack of paired (supervised) information. In particular, disentangling style and spatial information of the garment becomes a challenge, which existing methods either address by requiring auxiliary data or extensive online optimization procedures, thereby still inhibiting their scalability. To achieve a scalable virtual try-on system that can transfer arbitrary garments between a source and a target person in an unsupervised manner, we thus propose a texture-preserving end-to-end network, the PAtch-routed SpaTially-Adaptive GAN (PASTA-GAN), that facilitates real-world unpaired virtual try-on.

garment, patch-routed spatially-adaptive gan, scalable unpaired virtual try-on

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.76)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.65)

arXiv.org Artificial Intelligence Jun-21-2024

Masked Extended Attention for Zero-Shot Virtual Try-On In The Wild

Orzech, Nadav, Nitzan, Yotam, Mizrahi, Ulysse, Danon, Dov, Bermano, Amit H.

Virtual Try-On (VTON) is a highly active line of research, with increasing demand. It aims to replace a garment in one image with a garment from another, while preserving person and garment characteristics as well as image fidelity. Current literature takes a supervised approach to the task, impairing generalization and imposing heavy computation. In this paper, we present a novel zero-shot, training-free method for inpainting a garment by reference. Our approach employs the prior of a diffusion model with no additional training, fully leveraging its native generalization capabilities. The method employs extended attention to transfer image information from reference to target images, overcoming two significant challenges. We first warp the reference garment over the target human using deep features, alleviating "texture sticking". We then leverage the extended attention mechanism with careful masking, eliminating leakage of reference background and unwanted influence. Through a user study and qualitative and quantitative comparisons to state-of-the-art approaches, we demonstrate superior image quality and garment preservation for unseen clothing pieces or human figures.

arxiv, garment, reference garment

2406.15331

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > Saudi Arabia > Northern Borders Province > Arar (0.04)
Oceania > Australia (0.04)

Genre:

Research Report (1.00)
Overview > Innovation (0.34)

Industry: Health & Medicine (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.61)

arXiv.org Artificial Intelligence May-11-2024

Robot Detection System 3: LRF groups and Coordinate System

Lin, Jinwei

Front-following is technically harder to implement than the other two human-following technologies, but it is more practical and can be applied in more areas to solve more practical problems. In this paper, we analyze the detailed design of the LRF groups and the structure and combination design of the coordinate system of the Robot Detection System. We use ample figures to illustrate our novel design ideas. Our research results were open-sourced in 2018, and this paper simply aims to disseminate those results in more detail. Many design ideas are included in this paper; further ideas and analysis can be found in the other papers whose titles begin with "Robot Design System", by Jinwei Lin, the only author of this series of papers.

coordinate system, robot, spherical coordinate system

2405.08022

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)